 linguistic shift




Characterizing Linguistic Shifts in Croatian News via Diachronic Word Embeddings

Dukić, David, Barić, Ana, Čuljak, Marko, Jukić, Josip, Tutek, Martin

arXiv.org Artificial Intelligence

Measuring how semantics of words change over time improves our understanding of how cultures and perspectives change. Diachronic word embeddings help us quantify this shift, although previous studies leveraged substantial temporally annotated corpora. In this work, we use a corpus of 9.5 million Croatian news articles spanning the past 25 years and quantify semantic change using skip-gram word embeddings trained on five-year periods. Our analysis finds that word embeddings capture linguistic shifts of terms pertaining to major topics in this timespan (COVID-19, Croatia joining the European Union, technological advancements). We also find evidence that embeddings from post-2020 encode increased positivity in sentiment analysis tasks, contrasting studies reporting a decline in mental health over the same period.
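As an illustration of how a shift between two period-specific embedding spaces might be quantified, here is a minimal sketch. It is not the authors' pipeline: it assumes a second-order comparison, in which a word's similarities to a shared vocabulary are compared across periods so that the two spaces need no explicit alignment. The toy vectors, the words, and the function names are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def second_order_shift(word, emb_old, emb_new):
    """Shift score for `word` between two embedding spaces.

    Each space is a dict mapping word -> vector. The word is described by its
    similarities to every other shared word within each space, and the score
    is the cosine distance between those two similarity profiles.
    """
    shared = sorted((set(emb_old) & set(emb_new)) - {word})
    profile_old = [cosine(emb_old[word], emb_old[a]) for a in shared]
    profile_new = [cosine(emb_new[word], emb_new[a]) for a in shared]
    return 1.0 - cosine(profile_old, profile_new)

# Hypothetical toy embeddings for two periods (illustration only, not real data).
emb_old = {"virus": [1.0, 0.0], "computer": [0.9, 0.1],
           "illness": [0.0, 1.0], "doctor": [0.1, 0.9]}
# In the later period only "virus" moves, toward the medical words.
emb_new = dict(emb_old, virus=[0.0, 1.0])

shift_virus = second_order_shift("virus", emb_old, emb_new)
shift_doctor = second_order_shift("doctor", emb_old, emb_new)
```

In this toy setup the word whose neighbourhood changed ("virus") scores higher than a word that stayed put ("doctor"). A real study would train the per-period vectors on the corpus slices first and would likely restrict the shared vocabulary to frequent words.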


Why Campaigns to Change Language Often Backfire - Facts So Romantic

Nautilus

In the first decades of the 20th century, people around the world began succumbing to an entirely new cause of mortality. These new deaths, due to the dangers of the automobile, soon became accepted as a lamentable but normal part of modern life. A hundred years later, with 1.25 million people worldwide (about 30,000 in the U.S.) being killed every year in road crashes, there's now an effort to reject the perception that these deaths are normal or acceptable. As reported in the New York Times, a growing number of safety advocates, government officials, and journalists are moving away from the phrase "car accident" on the grounds that it presumes that the drivers involved are blameless--a presumption that is correct only 6 percent of the time, according to a report by the U.S. Department of Transportation. The vast majority of such incidents are caused by drivers who make mistakes, take risks, or drive while distracted or impaired.


The Global Anchor Method for Quantifying Linguistic Shifts and Domain Adaptation

Yin, Zi, Sachidananda, Vin, Prabhakar, Balaji

Neural Information Processing Systems

Language is dynamic, constantly evolving and adapting with respect to time, domain or topic. The adaptability of language is an active research area, where researchers discover social, cultural and domain-specific changes in language using distributional tools such as word embeddings. In this paper, we introduce the global anchor method for detecting corpus-level language shifts. We show both theoretically and empirically that the global anchor method is equivalent to the alignment method, a widely-used method for comparing word embeddings, in terms of detecting corpus-level language shifts. Despite their equivalence in terms of detection abilities, we demonstrate that the global anchor method is superior in terms of applicability as it can compare embeddings of different dimensionalities. Furthermore, the global anchor method has implementation and parallelization advantages. We show that the global anchor method reveals fine structures in the evolution of language and domain adaptation. When combined with the graph Laplacian technique, the global anchor method recovers the evolution trajectory and domain clustering of disparate text corpora.
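The claim that embeddings of different dimensionalities can be compared can be illustrated with a minimal sketch. It assumes the corpus-level shift is taken as the Frobenius norm of the difference between the two embeddings' Gram matrices (every word acting as an anchor for every other), which is how the method is usually summarized; the function names and toy matrices are hypothetical and the code is not taken from the paper.

```python
import math

def gram(E):
    """Gram matrix of an embedding: inner products between all pairs of word vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in E] for u in E]

def global_anchor_shift(E, F):
    """Frobenius norm of gram(E) - gram(F).

    E and F are lists of word vectors over the same vocabulary, in the same
    row order. Their dimensionalities may differ, because both Gram matrices
    are n-by-n for a vocabulary of n words.
    """
    if len(E) != len(F):
        raise ValueError("both embeddings must cover the same vocabulary")
    GE, GF = gram(E), gram(F)
    return math.sqrt(sum((x - y) ** 2
                         for row_e, row_f in zip(GE, GF)
                         for x, y in zip(row_e, row_f)))

# Hypothetical toy embeddings over a three-word vocabulary.
E = [[1, 0], [0, 1], [1, 1]]
E_rot = [[0, 1], [-1, 0], [-1, 1]]       # E rotated by 90 degrees
F = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]    # a 3-dimensional embedding

shift_rotated = global_anchor_shift(E, E_rot)
shift_other = global_anchor_shift(E, F)
```

Because inner products are unchanged by rotation, the rotated copy yields a shift of zero without any alignment step, while the 3-dimensional embedding can still be compared directly with the 2-dimensional one.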


How Language Helps Erase the Tragedy of Millions of Road Deaths - Facts So Romantic

Nautilus

In the first decades of the 20th century, people around the world began succumbing to an entirely new cause of mortality. These new deaths, due to the dangers of the automobile, soon became accepted as a lamentable but normal part of modern life. A hundred years later, with 1.25 million people worldwide (about 30,000 in the U.S.) being killed every year in road crashes, there's now an effort to reject the perception that these deaths are normal or acceptable. As reported in a recent New York Times article, a growing number of safety advocates, government officials, and journalists are moving away from the phrase "car accident" on the grounds that it presumes that the drivers involved are blameless--a presumption that is correct only 6 percent of the time, according to a report by the U.S. Department of Transportation. The vast majority of such incidents are caused by drivers who make mistakes, take risks, or drive while distracted or impaired.